Self-affine Fractals Embedded in Spectra of Complex Networks 
The scaling properties of the spectra of real world complex networks are studied by using the wavelet
transform. It is found that the spectra of networks are multifractal. According to the value of
the long-range correlation exponent, the Hurst exponent H, the networks can be classified into three
types, namely, H > 0.5, H = 0.5 and H < 0.5. All real world networks considered belong to the
class of H > 0.5, which may be explained by their hierarchical properties.
Complex networks have attracted increasing attention
in recent years due to their relevance to diverse problems
in the physical, biological, and social sciences [1, 2, 3]. The
primary purpose is to understand the relations between
the underlying structures, dynamics and functions. Generally,
dynamical processes such as the transport of mass,
energy, signals and/or information occur at different structural
scales. The organization patterns at different scales
may provide a reasonable solution to the problems.

Song et al. found that the World-Wide-Web
(WWW), social, protein-protein interaction (PPI) and
cellular networks are fractal under a length-scale transform.
Namely, one can define a topological box in which
the shortest path between each pair of nodes is less than $l_B$, the
size of the box. The fractal behavior implies a power-law
relation between the minimum number of boxes, $N_B$,
needed to cover the entire network and the box size,
$N_B(l_B) \sim l_B^{-d_B}$, where $d_B$ is the fractal dimension.

Detailed work has been done on the covering methods.
It is shown that finding the minimum number of
boxes needed to cover a network can be mapped to the graph
coloring problem in the NP-complete complexity class, and
that the well-established algorithms for the coloring problem
provide a solution close to optimal. A random burning-based
algorithm has also been proposed because it offers a number of other
benefits.

A network with N identical nodes is described by an
adjacency matrix $A$ whose elements $A_{ij} = 1$ or $0$ if
nodes $i$ and $j$ are connected or disconnected, respectively.
By mapping the nodes and the edges to atoms
and bonds, the network can be regarded as a large cluster.
The Hückel Hamiltonian of the large cluster reads
$\epsilon \cdot I + \eta \cdot A$, where $\epsilon$ and $\eta$ are the site energy and the hopping
integral, respectively. Generally, we can set $\epsilon = 0$
and $\eta = 1$, that is, the Hamiltonian is $A$. The spectrum
of the network is defined as the rank-ordered eigenvalues
of $A$, namely, $E = \{E_1 \le E_2 \le \cdots \le E_N\}$.
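As an illustration of this definition, the following minimal sketch computes the rank-ordered spectrum of a small undirected network with NumPy. The function name `network_spectrum` and the four-node ring used as input are assumptions made for this example only; they do not come from the original calculations.

```python
import numpy as np

def network_spectrum(adjacency):
    """Return the rank-ordered eigenvalues E_1 <= ... <= E_N of a symmetric 0/1 matrix."""
    a = np.asarray(adjacency, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        raise ValueError("adjacency matrix must be square")
    if not np.allclose(a, a.T):
        raise ValueError("adjacency matrix must be symmetric (undirected network)")
    # eigvalsh is intended for symmetric matrices and returns eigenvalues in ascending order.
    return np.linalg.eigvalsh(a)

# Hypothetical toy input: a ring of four nodes.
ring = np.array([[0, 1, 0, 1],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]])
levels = network_spectrum(ring)
```

For the ring the levels are -2, 0, 0 and 2, so degenerate levels can occur even in very small networks.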



The topological structure of the network determines 
the spectrum. The invariance properties embedded in 
the spectrum in turn reflect the topological symmetries of 
the network. It is well known that the fractal structures 
of aperiodic crystals lead to the fractal behaviors of the 
corresponding spectra (For a detailed review, see Ref. 
Q and the references therein). An interesting question 
is then, how the fractal structures of networks affect the 
corresponding spectra. In this paper, we shall detect the 
scaling properties embedded in spectra of networks. 

The wavelet transform (WT) is used to detect the scaling properties.
We consider the nearest-neighbor level spacing series
$L = \{L_i = E_{i+1} - E_i,\ i = 1, 2, \ldots, N-1\}$.
The WT of the series $L$ can be calculated as

$$T(s,a) = \frac{1}{a}\sum_{i=1}^{N-1} L_i\, g\!\left(\frac{i-s}{a}\right),$$

where $g$ is the wavelet and $a$ is the given scale.
The wavelet transform can effectively remove
polynomial trends along the series.
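The transform can be sketched directly from this sum. The code below is only an illustration under stated assumptions: the wavelet is taken to be the n-th derivative of exp(-x^2/2), evaluated through probabilists' Hermite polynomials, the sum is computed by brute force for every position s, and the function names are hypothetical. It is not the PhysioToolkit implementation.

```python
import numpy as np

def gaussian_derivative_wavelet(x, n):
    """n-th derivative of exp(-x**2 / 2), using d^n/dx^n = (-1)^n He_n(x) exp(-x**2 / 2)."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0  # selects the Hermite polynomial He_n
    he_n = np.polynomial.hermite_e.hermeval(x, coeffs)
    return (-1) ** n * he_n * np.exp(-0.5 * x ** 2)

def wavelet_transform(series, scale, n=4):
    """T(s, a) = (1/a) * sum_i L_i * g((i - s) / a), evaluated at every position s."""
    series = np.asarray(series, dtype=float)
    idx = np.arange(series.size)
    # Rows correspond to positions s, columns to series indices i.
    arg = (idx[None, :] - idx[:, None]) / scale
    return (gaussian_derivative_wavelet(arg, n) * series[None, :]).sum(axis=1) / scale
```

Away from the ends of the series, a linear trend such as `3.0 + 0.5 * np.arange(400)` gives a transform that vanishes to numerical precision, which illustrates the trend removal mentioned above. The brute-force sum needs memory quadratic in the series length, so a convolution would be preferable for long spectra.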

The series under consideration can be decomposed into
many subsets characterized by different local Hurst exponents,
which quantify the local singular behavior and
thus relate to the local scaling of the series. Traditionally,
the local Hurst exponents are evaluated through the
modulus of the maximal values of $T(s,a)$ at each point
in the series. We denote the positions of the WT maxima
by $\{s_1, s_2, \ldots, s_m\}$. In the long-scale limit, the
partition function is expressed as

$$Z(a,q) = \sum_{j=1}^{m} |T(s_j,a)|^q \sim a^{\tau(q)}. \tag{1}$$
For positive and negative $q$, $\tau(q)$ reflects the scaling of
the large and small fluctuations, respectively.
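A simplified numerical version of this step is sketched below. It assumes that the maxima are the interior local maxima of |T(s, a)| at a fixed scale and that the scaling exponent is estimated as the slope of log Z against log a. A full modulus-maxima analysis also follows the maxima lines across scales, which this sketch omits, and the function names are hypothetical.

```python
import numpy as np

def partition_function(transform_row, q):
    """Sum of |T(s_j, a)|**q over the local maxima s_j of |T(., a)| at one scale a."""
    mod = np.abs(np.asarray(transform_row, dtype=float))
    inner = mod[1:-1]
    # Interior points that exceed the left neighbour and are not below the right one.
    is_max = (inner > mod[:-2]) & (inner >= mod[2:])
    maxima = inner[is_max]
    maxima = maxima[maxima > 0]  # avoids division by zero for negative q
    return np.sum(maxima ** q)

def scaling_exponent(scales, z_values):
    """Least-squares slope of log Z(a, q) versus log a, an estimate of the exponent at fixed q."""
    slope, _intercept = np.polyfit(np.log(scales), np.log(z_values), 1)
    return slope
```

Repeating the fit for a range of q gives the curve whose linearity or curvature is discussed in the text; large positive q weights the largest maxima and negative q the smallest ones.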

If $\tau(q)$ is a straight line, the analyzed series contains
only linear correlations (monofractal) and its slope represents
the Hurst exponent. If $\tau(q)$ is a nonlinear function,
the series is called multifractal, since different subsets
of the series exhibit different local Hurst exponents. In
order to characterize this multifractality, one considers the
fractal dimension $D(h)$ of the subset of the series that is characterized
by the local exponent $h$, which is related to $\tau(q)$ by a Legendre
transform, $D(h) = qh - \tau(q)$, $h = d\tau(q)/dq$. The width of
this function for $q \to \pm\infty$ is a measure of the strength
of the multifractality, $\Delta\alpha = \alpha_{\max} - \alpha_{\min}$,
where $\alpha$ denotes the same local exponent $h$.

However, the numerical derivative of $\tau(q)$ in this
method may induce unacceptable errors in $\Delta\alpha$. Thus,
we employ a functional form fitted to $\tau(q)$, suggested by
Kantelhardt et al.,

$$\tau(q) = -\frac{\ln(x^q + y^q)}{\ln 2}. \tag{2}$$

The distribution width of the Hurst exponent is given by

$$\Delta\alpha = \frac{|\ln x - \ln y|}{\ln 2}. \tag{3}$$
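The link between the fitted form and the width can be checked numerically. In the sketch below the parameter values x = 0.45 and y = 0.89 are illustrative assumptions, and the function names are hypothetical. The check compares the closed-form width with the difference between the slopes of the fitted form at large negative and large positive q.

```python
import math

def tau_cascade(q, x, y):
    """Two-parameter fitting form: -ln(x**q + y**q) / ln 2."""
    return -math.log(x ** q + y ** q) / math.log(2.0)

def multifractal_width(x, y):
    """Width of the exponent distribution: |ln x - ln y| / ln 2."""
    return abs(math.log(x) - math.log(y)) / math.log(2.0)

x, y = 0.45, 0.89  # illustrative values, not fitted to any data here
width = multifractal_width(x, y)
# Unit-step slopes far out on both sides approximate the limiting exponents.
slope_pos = tau_cascade(41, x, y) - tau_cascade(40, x, y)
slope_neg = tau_cascade(-40, x, y) - tau_cascade(-41, x, y)
```

The form satisfies the normalization at q = 0 for any positive x and y, and `slope_neg - slope_pos` agrees with `width`, because the larger of x and y dominates for large positive q and the smaller one for large negative q.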

Sometimes a bifractal form is required to obtain $\Delta\alpha$. For a
bifractal series, $\tau(q)$ is characterized by two distinct slopes
$\alpha_1$ and $\alpha_2$,

$$\tau(q) = \begin{cases} q\alpha_1 - 1, & q \le q_\times \\ q\alpha_2 + q_\times(\alpha_1 - \alpha_2) - 1, & q > q_\times \end{cases} \tag{4}$$

We can obtain the multifractal strength, $\Delta\alpha = |\alpha_1 - \alpha_2|$.
These forms can be derived from a modification of the
multiplicative cascade model [11].

In the multifractal case, one conventionally refers to
the exponent obtained from the second moment as the Hurst exponent, i.e.,

$$H = \frac{\tau(2) + 1}{2}. \tag{6}$$



For H > 0.5, the levels tend to form local clusters
with small level spacings at different scales, while
for H < 0.5 these clusters cannot be formed. The
critical value H = 0.5 corresponds to a series
whose integrated series behaves like a random
walk. These characteristics are evidently induced by
the structures of networks generated by different mechanisms.
Therefore, the exponent H can be used as a
criterion to classify networks into three categories with
H < 0.5, H = 0.5 and H > 0.5, respectively.

Theoretically, we should have $\tau(0) = -1$, while the
calculated values may deviate slightly from it. The deviation
$\Delta\tau(0) = 1 - |\tau(0)|$ can be used as an estimate of
the error of $\tau(2)$. The corresponding error of $H$ is

$$\delta H = \Delta\tau(0)/2. \tag{7}$$
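The two relations for the Hurst exponent and its error amount to simple arithmetic on the fitted scaling exponents. The sketch below uses hypothetical input values and a hypothetical function name, chosen only to show the calculation.

```python
def hurst_from_tau(tau_2, tau_0):
    """Return (H, delta_H), with H from the exponent at q = 2 and the error from the one at q = 0."""
    hurst = (tau_2 + 1.0) / 2.0
    delta_tau_0 = 1.0 - abs(tau_0)  # deviation from the theoretical value -1
    return hurst, delta_tau_0 / 2.0

# Hypothetical fitted exponents, not taken from the paper's data.
h, delta_h = hurst_from_tau(tau_2=0.32, tau_0=-0.98)
```

With these inputs the result is H = 0.66 with an error estimate of 0.01. The error estimate keeps its sign here, as in the definition above, so a value of |tau(0)| larger than 1 would give a negative number.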



In each application reported below we have used the
real analytic wavelet $g^{(n)}$ among the class of derivatives
of the Gaussian function. Polynomial trends up to
order $n$ can be removed. We present the results obtained with
the parameter value $n = 4$. Calculations with higher
orders ($n = 5$ and $6$) lead to almost the same results.



FIG. 1: (Color online) Self-affine properties embedded in
spectra of real world networks. (a, b) Histograms of the
levels of the E. coli cellular network and the actor subnetwork
(containing the nodes numbered 1-8,000). (a', b')
Partition functions. (a'', b'') Scaling exponent $\tau(q)$ as a function
of $q$. The E. coli cellular network is multifractal.
The actor subnetwork is bifractal, and Eq. (5)
is used to obtain $\Delta\alpha$. (c) The relations of $\tau(q)$ versus $q$ for
the high-confidence, low-confidence and artificial versions of
the S. cerevisiae protein-protein interaction network.
High confidence may not necessarily imply high quality. The
result for the artificial network is an average over 20
realizations. A dashed line with slope 0.66 is added as a
reference. The partition functions are shifted to avoid
overlapping.



Randomizing $L$, we also detect the scaling behaviors
embedded in the resulting series (called the shuffled series)
as a comparison. The partition functions $Z(a,q)$ are
calculated by using the software provided in PhysioToolkit
[12]. The integrated series, i.e., the spectrum $E$, is used
as the input data. The relation in Eq. (6) is also checked
by using the DFA software.
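The shuffled comparison series can be produced as sketched below. The sketch assumes a uniformly random permutation of the level spacings followed by re-integration into a surrogate spectrum; the function name and the seed argument are illustrative assumptions.

```python
import numpy as np

def shuffled_spectrum(levels, seed=0):
    """Permute the nearest-neighbor level spacings and integrate them back into levels."""
    levels = np.sort(np.asarray(levels, dtype=float))
    spacings = np.diff(levels)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(spacings)
    # The surrogate starts at the same lowest level and ends at the same highest level.
    return levels[0] + np.concatenate(([0.0], np.cumsum(shuffled)))
```

The surrogate keeps the distribution of spacings but destroys their ordering, so any difference in H between the original and the shuffled series reflects correlations along the spectrum.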

We examine the scaling behaviors for the spectra of
some real world networks [4]. The cellular networks cover
cellular functions such as intermediate metabolism
and bioenergetics, information pathways, electron transport,
and transmembrane transport. The directed edges
are simply replaced with undirected edges. Generally,
we find the spectra of real world networks to be multifractal.
We present in Fig. 1(a)-(a'') the result for the E.
coli cellular network. The Hurst exponent is distributed
over a wide range, $\Delta\alpha = 1.0$. The long-range correlation
exponent is 0.5. For the actor network, we consider
only the subnetwork containing the nodes numbered 1-8,000.
The spectrum of this network is bifractal,
and the long-range correlation exponent is 0.75, as given
in Fig. 1(b)-(b'').

For the S. cerevisiae protein-protein interaction network,
we consider two versions of the database. One
is investigated in [14] and contains 1381 nodes and
2493 edges. The other is from [13] and has 1037
nodes and 1058 edges; the edges in this version are
of high confidence. They are called the low-confidence and
high-confidence networks, respectively. As shown in Fig. 1(c),
the addition of the so-called low-confidence edges in the
low-confidence network makes $\tau(q)$ versus $q$ significantly
closer to a linear relation. The slope of the black dashed
line is H = 0.66.

This change of the relation of $\tau(q)$ versus $q$ between the
high-confidence and low-confidence networks may be caused simply
by a size effect. To exclude the size effect, we also
consider some artificial networks, in which the same
numbers of random edges and nodes are added to the
high-confidence network. Starting from the high-confidence
network, at each step a new node is added by connecting
it to a randomly selected node in the existing network.
When the size of the network reaches 1381, we add edges
between randomly selected pairs of nodes until the total
number of edges is 2493. The resulting network has the
same numbers of edges and nodes as the low-confidence
network.
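The two-stage construction described above can be sketched as follows. The sketch assumes that nodes are labelled 0 to n-1, that the starting edge list has no self-loops, and that duplicate edges are not allowed; the function name and arguments are hypothetical.

```python
import random

def grow_random_network(edges, n_nodes, target_nodes, target_edges, seed=0):
    """Add randomly attached nodes, then random edges, until both target sizes are reached."""
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    # Stage 1: each new node is connected to one randomly selected existing node.
    for new in range(n_nodes, target_nodes):
        old = rng.randrange(new)
        edge_set.add(frozenset((new, old)))
    if target_edges > target_nodes * (target_nodes - 1) // 2:
        raise ValueError("target_edges exceeds the number of possible node pairs")
    # Stage 2: edges between randomly selected pairs until the edge count is reached.
    while len(edge_set) < target_edges:
        u, v = rng.sample(range(target_nodes), 2)
        edge_set.add(frozenset((u, v)))  # an already existing pair leaves the set unchanged
    return sorted(tuple(sorted(e)) for e in edge_set)
```

For the case in the text the call would take the high-confidence edge list with `n_nodes=1037`, `target_nodes=1381` and `target_edges=2493`, repeated with different seeds to average over realizations.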

The randomly added edges and nodes in the artificial
networks do not lead to a similar result. This comparison
tends to support the conjecture that an
exactly constructed protein interaction network is
perfectly fractal. The deviation of the actual structure
from the perfect fractal is due to the incompleteness
of the databases, which are continuously being updated
with newly discovered physical interactions. High
confidence may not necessarily imply high quality.

The scaling characteristics for the real world networks
are listed in Table I. We present only the results of networks
whose partition functions meet the scaling relation
in Eq. (1) over a wide range of $q$, namely $\Delta q > 5$. Interestingly,
we find that the values of $H$ for the real world
networks are in the range $H \ge 0.41$. Taking note of
the error estimates $\delta H$ presented in Table
I, we have $H \ge 0.41 \approx 0.5$.

The hierarchical property may be helpful in understanding
the fact that H > 0.5 for the real world networks.
In the present paper, we use the definition of
hierarchy proposed in [15]. That is, for a hierarchical
network, besides the small-world and scale-free characteristics,
there exists a simple relation between the clustering
coefficient $C$ and the degree $k$, $C(k) \sim k^{-1}$. Our
detailed calculations show that all the considered real
world networks are hierarchical in this sense.
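Whether a network is hierarchical in this sense can be examined by averaging the local clustering coefficient over nodes of equal degree. The following is a minimal sketch with a hypothetical function name, assuming an undirected edge list and skipping nodes of degree below 2, for which the coefficient is undefined.

```python
from collections import defaultdict
from itertools import combinations

def clustering_by_degree(edges):
    """Map each degree k >= 2 to the mean local clustering coefficient of nodes with degree k."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    sums, counts = defaultdict(float), defaultdict(int)
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            continue
        # Number of edges that actually exist among the k neighbors of this node.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        sums[k] += 2.0 * links / (k * (k - 1))
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}
```

Plotting the returned values against k on logarithmic axes shows whether the clustering decays with the degree as a power law.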

For the Watts-Strogatz small-world (WSSW) networks,
we construct a regular circular lattice in which each
node is connected to its $d$ right-handed nearest neighbors.
Each edge is then rewired with probability $p_r$ to another
randomly selected node.
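A sketch of this construction is given below. It follows the standard Watts-Strogatz recipe under two assumptions that the text does not state: a rewired edge keeps its left end, and the new partner is drawn uniformly among nodes that are not yet neighbors, so that self-loops and duplicate edges are excluded. The function name is hypothetical.

```python
import random

def wssw_network(n, d, p_r, seed=0):
    """Ring of n nodes, each linked to its d right-hand neighbors, rewired with probability p_r."""
    if n <= 2 * d:
        raise ValueError("need n > 2 * d so that all lattice edges are distinct")
    rng = random.Random(seed)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for step in range(1, d + 1):
            j = (i + step) % n
            neighbors[i].add(j)
            neighbors[j].add(i)
    for step in range(1, d + 1):
        for i in range(n):
            if rng.random() >= p_r:
                continue  # this lattice edge is kept
            j = (i + step) % n
            candidates = [k for k in range(n) if k != i and k not in neighbors[i]]
            if not candidates:
                continue
            k = rng.choice(candidates)
            neighbors[i].discard(j)
            neighbors[j].discard(i)
            neighbors[i].add(k)
            neighbors[k].add(i)
    return sorted((i, j) for i in range(n) for j in neighbors[i] if i < j)
```

The number of edges stays fixed at n times d for any rewiring probability, so networks generated with different values of p_r differ only in their wiring.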

As for the Barabási-Albert scale-free (BASF) networks,
we start from several connected nodes as a seed. At
each step we add a new node and $w$ edges from the new
node to different preferentially selected nodes in the existing
network. The probability for a node to be selected
is proportional to its degree.

The results for the constructed networks are listed in
Table I. The sizes of the networks are 2,000, and the
parameter $d$ is set to 2. The values of $H$ for the WSSW
networks are in the range 0.15 to 0.31. For the BASF
networks, with the increase of $w$ the small-world effect
becomes more and more significant and the value of $H$
decreases rapidly from 0.5 ($w = 2$) to < 0.2 ($w > 3$).
Hence, for the real world networks, the hierarchy is
essential for the values of $H$ being larger than 0.5.

The values of $H$ for the shuffled series are almost
exactly 0.5, and the corrections due to the fluctuations
$\Delta\tau(0)$ are negligible. The WSSW and BASF networks
with sizes 4,000, 6,000 and 8,000 have similar
characteristics (not shown in Table I).

In summary, we have found self-affine fractals embed- 
ded in spectra of complex networks. For the real world 
networks considered in the present work, the values of 
the long-range correlation exponents are in the range of 
H > 0.5, which may be attributed to the hierarchical 
properties in the sense of a dependence of clustering on 
the degree. This evidence may support the idea that 
fractals in topological structures induce the fractals in 
spectra of networks. 

For the constructed BASF networks, which do not have
box-based fractal structures, we have also found rich
multifractal structures in the spectra. However, the values of
H are all significantly smaller than 0.5 for networks with
w > 3. There may exist a new kind of scale invariance
in the topological structures of the constructed networks,
other than the box-based fractals.

One paradox may be raised: the box-based fractal
can be explained by degree-degree anti-correlations,
while we find positive correlations in the spectra
(H > 0.5) for the real world networks. Because of the
degree-degree anti-correlations, the nodes tend to aggregate
into many small structural clusters with the
hubs as centers, and there exist loose connections
between the clusters. There are strong "repulsive effects"
between the levels within each cluster, but the levels of
different clusters may be very close or even degenerate.
That is, there will appear some regions with a high density
of levels in the spectrum, called level clusters. We can
expect H > 0.5 (positive correlations in spectra) for this
kind of network.

For the BASF networks with w > 3, by contrast, because of the
strong correlations between the hubs, the clusters centered
at the hubs merge into a small number of large
clusters. The strong "repulsive effects" between the
levels make the so-called "clustering of levels" impossible.
Consequently, the spectra are anti-correlated (H < 0.5).

Hence, the difference between our results and the box-based
results is not necessarily a contradiction. Obviously,
the relation between the self-affine behaviors of spectra
and the fractal dimension based upon box-counting
approaches deserves further investigation.

Network comparison is an important topic in systems
biology. It can shed light on evolution and on disease
detection by comparing the cellular networks of different
species, or diseased and healthy cellular networks.
One basic task is to design node-labeling-independent
representations of networks and circumvent the problem
of graph isomorphism. Spectral analysis of complex
networks may provide useful information for that purpose.
Real world networks

Group          Network            Fit parameters         H/δH
PPI [12]       D. melanogaster    {0.51, 0.66, 0.40}     0.66/0.01
               C. elegans         0.41/0.67/0.73         0.85/0.07
Cellular [3]   B. burgdorferi     0.61/0.61/0.00         0.72/0.07
               A. aeolicus        0.42/0.80/0.94         0.64/0.01
               C. elegans         0.37/0.93/1.35         0.50/0.03
               E. coli            0.45/0.89/1.00         0.50/0.04
               ...                0.45/0.83/0.89         0.59/0.08
               M. leprae          0.51/0.88/0.77         0.47/0.04
               P. aeruginosa      0.43/0.93/1.12         0.46/0.02
               S. typhi           0.42/0.96/1.18         0.43/0.04
               T. pallidum        0.38/0.82/1.11         0.65/0.07
               Y. pestis          0.54/0.87/0.69         0.47/0.03
               C. pneumoniae      0.63/0.86/0.44         0.41/0.07

Constructed networks

Model          Parameter          Fit parameters         H/δH
WSSW           p_r = 0.12         0.76/0.95/0.33         0.22/0.01
               p_r = 0.15         0.66/1.00/0.61         0.24/0.01
               p_r = 0.21         0.76/1.00/0.40         0.17/0.01
               p_r = 0.24         0.70/1.00/0.50         0.21/0.01
               p_r = 0.27         0.72/1.00/0.47         0.20/0.01
               ...                0.73/1.00/0.46         ...
BASF           w = 2              0.71/0.71/0.00         0.50/0.02
               w = 3              {1.05, 0.25, -1.13}    0.25/0.01
               w = 4              0.77/1.00/0.39         0.17/0.00
               w = 5              0.76/1.00/0.40         0.17/0.02
               w = 6              0.74/1.00/0.43         0.18/0.01
               w = 7              0.74/1.00/0.44         0.19/0.01
               w = 8              0.79/1.00/0.35         0.15/0.01


TABLE I: The self-affine fractals embedded in spectra of real world, WSSW and BASF networks. For the real world networks
the values of H are basically in the range of H > 0.45 ~ 0.5, while those for the WSSW and BASF networks are significantly smaller,
namely, H < 0.3. We present only the results for networks whose partition functions meet the scaling relation in Eq. (1) over a
wide range of q, namely Δq > 5.